Reviving Sequential Program Birthmarking for Multithreaded Software Plagiarism Detection
Abstract
As multithreaded programs become increasingly popular, plagiarism of multithreaded programs has started to plague the software industry. Although there has been tremendous progress on software plagiarism detection technology, existing dynamic birthmark approaches are applicable only to sequential programs, because thread scheduling nondeterminism severely perturbs birthmark generation and comparison. We propose a framework called TOB (Thread-Oblivious dynamic Birthmark) that revives existing techniques so they can be applied to detect plagiarism of multithreaded programs. This is achieved by thread-oblivious algorithms that shield executions from the influence of thread schedules. We have implemented a set of tools collectively called TOB-PD (TOB based Plagiarism Detection tool) by applying TOB to three existing representative dynamic birthmarks: SCSSB (System Call Short Sequence Birthmark), DYKIS (DYnamic Key Instruction Sequence birthmark) and JB (an API based birthmark for Java). Our experiments, conducted on a large number of binary programs, show that our approach exhibits strong resilience against state-of-the-art semantics-preserving code obfuscation techniques. Comparisons against the three existing tools SCSSB, DYKIS and JB show that the new framework is effective for plagiarism detection of multithreaded programs. The tools, the benchmarks and the experimental results are all publicly available.
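To make the scheduling problem concrete, the following is a minimal illustrative sketch in Python. It is not the TOB algorithm, whose details the abstract does not give. It assumes a trace is a list of hypothetical `(thread_id, event)` pairs and that a birthmark is a set of k-grams over events. The function names, the two sample runs, and the per-thread slicing strategy are all assumptions chosen for illustration.

```python
from collections import defaultdict


def kgrams(seq, k):
    """Return the set of all length-k windows of seq."""
    return {tuple(seq[i:i + k]) for i in range(len(seq) - k + 1)}


def global_birthmark(trace, k=2):
    """k-grams over the interleaved trace; the result depends on the schedule."""
    return kgrams([event for _tid, event in trace], k)


def per_thread_birthmark(trace, k=2):
    """k-grams computed inside each thread's own event stream, then merged.

    Assumption for illustration only: slicing by thread is one plausible way
    to hide interleaving, not necessarily what TOB does.
    """
    by_thread = defaultdict(list)
    for tid, event in trace:
        by_thread[tid].append(event)
    merged = set()
    for events in by_thread.values():
        merged |= kgrams(events, k)
    return merged


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Two runs of the same hypothetical program under different thread schedules.
run_a = [(1, "open"), (1, "read"), (2, "socket"), (2, "send"), (1, "close")]
run_b = [(2, "socket"), (1, "open"), (2, "send"), (1, "read"), (1, "close")]

sim_global = jaccard(global_birthmark(run_a), global_birthmark(run_b))
sim_sliced = jaccard(per_thread_birthmark(run_a), per_thread_birthmark(run_b))
print(sim_global)  # 0.0: the interleaved k-grams share nothing
print(sim_sliced)  # 1.0: per-thread k-grams are identical
```

In this toy case the same program looks entirely different to a sequential k-gram birthmark once the schedule changes, while the per-thread view is unaffected. A real tool would also have to match threads across runs, since thread identifiers are not stable between executions.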
Related resources
Exploiting thread-related system calls for plagiarism detection of multithreaded programs
Dynamic birthmarking used to be an effective approach to detecting software plagiarism. Yet the new trend towards multithreaded programming renders existing algorithms almost useless, due to the fact that thread scheduling nondeterminism severely perturbs birthmark generation and comparison. In this paper, we redesign birthmark based software plagiarism detection algorithms to make such approac...
Detecting Software Theft with API Call Sequence Sets
Software birthmarking uses a set of unique characteristics that every program has upon creation to justify ownership claims over stolen software. This paper presents a novel birthmarking technique based on the interaction of a program with the standard API. We have used this technique to successfully distinguish four different implementations of PNG image processing.
Detecting Software Theft via Whole Program Path Birthmarks
A software birthmark is a unique characteristic of a program that can be used as a software theft detection technique. In this paper we present and empirically evaluate a novel birthmarking technique — Whole Program Path Birthmarking — which uniquely identifies a program based on a complete control flow trace of its execution. To evaluate the strength of the proposed technique we examine two im...
Desktop Tools for Offline Plagiarism Detection in Computer Programs
Plagiarism in universities has always been a difficult problem to overcome. Various tools have been developed over the past few years to help teachers detect plagiarism in students’ work. By being able to categorize the multitude of plagiarism detection tools, it is possible to estimate their capabilities, advantages and disadvantages. In this article I consider modern plagiarism software solut...
Software metrics and plagiarism detection
The reliability of plagiarism detection systems, which try to identify similar programs in large populations, is critically dependent on the choice of program representation. Software metrics conventionally used as representations are described, and the limitations of metrics adapted from software complexity measures are outlined. An application-specific metric is proposed, one that represents t...